Financial fraud poses a significant challenge for banks and online financial service providers. Traditional fraud-detection methods are often inadequate because of imbalanced datasets, limited interpretability, and privacy concerns involving confidential customer information. This paper presents an explainable AI-based system for financial fraud detection designed to address these issues. The system uses the Light Gradient Boosting Machine (LightGBM) as its primary model, combined with SMOTE oversampling to mitigate class imbalance. Privacy is maintained by anonymizing sensitive features, including Personally Identifiable Information (PII); attributes such as name_email_similarity are added temporarily and removed again before model training. Model transparency is achieved through SHAP (SHapley Additive exPlanations), which provides feature-level interpretability for fraud predictions. The system is implemented as an interactive web dashboard built with the Flask framework, which lets users upload datasets, run fraud detection, adjust detection sensitivity through threshold tuning, and download a detailed fraud report. Evaluated on a real-world dataset, the system achieved an overall accuracy of 98.5% and an ROC-AUC of 0.89, while preserving privacy through anonymization and providing SHAP-based interpretability. The proposed solution offers a practical end-to-end framework that balances accuracy, transparency, and privacy protection, making it suitable for banking and fintech fraud-detection applications.
Introduction
This paper addresses the growing challenge of financial fraud in the digital banking era and proposes an explainable, machine-learning-based solution for fraud detection. Fraud is intentional deception for unlawful gain, and it has become more sophisticated with globalization, electronic finance, and digital payment systems. Digital services increase convenience, but they also raise fraud risk by exposing weaknesses in authentication, user behavior, and transaction systems. Fraud prevention measures exist but are never foolproof, so fraud detection remains essential for identifying suspicious activity in time.
Detecting fraud presents several challenges, including strict privacy regulations, highly imbalanced datasets where fraudulent transactions are rare, the lack of interpretability in high-accuracy “black box” models, the evolving nature of fraud tactics, and the need for scalable solutions that can handle millions of transactions daily.
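To make the class-imbalance problem concrete, the following minimal sketch compares two classifiers on an imbalanced dataset. All counts are hypothetical and chosen only for illustration (they are not results from this study), and the helper name classification_metrics is an assumption. It shows why accuracy alone can be misleading when fraud is rare, and why precision, recall, and F1 are reported alongside it.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a classifier never predicts fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical data: 10,000 transactions, 100 of them fraudulent (1%).
# Classifier A labels every transaction as legitimate.
always_legit = classification_metrics(tp=0, fp=0, fn=100, tn=9900)

# Classifier B (hypothetical) catches 70 frauds and raises 50 false alarms.
detector = classification_metrics(tp=70, fp=50, fn=30, tn=9850)

for name, m in [("always legitimate", always_legit), ("detector", detector)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this illustration both classifiers score about 99% accuracy, yet the first one detects no fraud at all; only recall and F1 separate them.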
The main objective of the project is to develop a centralized, explainable fraud detection system using machine learning. Key goals include protecting privacy through data anonymization, addressing class imbalance with SMOTE, building a fast and accurate LightGBM model, and using SHAP for model explainability. The system is deployed as a Flask-based web application that enables near real-time fraud detection and explanation, with performance evaluated using standard metrics such as accuracy, precision, recall, F1-score, and ROC-AUC.
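The idea behind SMOTE can be sketched in a few lines. The code below is a simplified, from-scratch illustration of the interpolation step only, written with the Python standard library; it is not the implementation used in this system, and the function name smote_sketch, the sample points, and the parameter defaults are all assumptions. Each synthetic fraud sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority (fraud) samples with two numeric features each.
fraud = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(fraud, n_new=6)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is a convex combination of two real fraud samples, it stays inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation data keeps its original class distribution.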
Related work has explored federated learning combined with explainable AI, ensemble learning methods, traditional machine learning models, and emerging quantum machine learning techniques. All of these approaches aim to improve fraud-detection accuracy while addressing privacy and interpretability concerns.
The methodology follows a centralized system architecture consisting of data ingestion, privacy protection, preprocessing, model training, evaluation, explainability, and web deployment modules. The implementation uses anonymized transactional data, preprocesses it through imputation, encoding, and scaling, applies SMOTE to balance the classes, and employs LightGBM for prediction, with SHAP explaining the model's decisions. Together, these components form an effective, interpretable, and scalable approach to financial fraud detection.
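The feature-level attributions that SHAP reports are based on Shapley values. The sketch below computes exact Shapley values for a toy scoring function by enumerating every feature coalition. It is an illustration of the underlying concept only, not the SHAP library and not the LightGBM model described here; the scoring function toy_score, its feature names, and its weights are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(present):
    """Hypothetical fraud score given the set of features that are 'known'."""
    score = 0.05  # baseline score with no feature information
    if "amount" in present:
        score += 0.30
    if "new_device" in present:
        score += 0.20
    if "amount" in present and "new_device" in present:
        score += 0.10  # interaction term, shared between the two features
    if "account_age" in present:
        score -= 0.05
    return score


features = ["amount", "new_device", "account_age"]
phi = shapley_values(toy_score, features)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the full score and the baseline, which is the property that lets a per-transaction fraud score be decomposed into per-feature contributions. Exact enumeration grows exponentially with the number of features, which is why practical explainers for tree models rely on specialized algorithms instead.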
Conclusion
This study demonstrates that explainability and privacy can be effectively integrated into AI-based fraud detection systems to achieve accurate, transparent, and ethically responsible outcomes. The proposed system successfully meets the key objectives of high accuracy, strong interpretability, and robust privacy preservation, while also maintaining scalability for deployment in real-world financial environments.